201 research outputs found

Lost in parameter space: A road map for Stacks

Author: Catchen JM
Paris JR
Stevens JR
Publication venue: 'Wiley'
Publication date: 18/04/2017

This is the author accepted manuscript. The final version is available from Wiley via the DOI in this record.

1. Restriction site-Associated DNA sequencing (RAD-seq) has become a widely adopted method for genotyping populations of model and non-model organisms. Generating a reliable set of loci for downstream analysis requires appropriate use of bioinformatics software, such as the program stacks.
2. Using three empirical RAD-seq datasets, we demonstrate a method for optimising a de novo assembly of loci using stacks. By iterating values of the program's main parameters and plotting resultant core metrics for visualisation, researchers can gain a much better understanding of their dataset and select an optimal set of parameters; we present the 80% rule as a generally effective method to select the core parameters for stacks.
3. Visualisation of the metrics plotted for the three RAD-seq datasets shows that they differ in the optimal parameters that should be used to maximise the amount of available biological information. We also demonstrate that building loci de novo and then integrating alignment positions is more effective than aligning raw reads directly to a reference genome.
4. Our methods will help the community in honing the analytical skills necessary to accurately assemble a RAD-seq dataset.

This work was co-funded by the Environment Agency, Westcountry Rivers Trust and the University of Exeter. Overseas collaboration for the project was made possible by funding from The Genetics Society, Santander and the University of Exeter. Thank you to many RAD-seq workshop participants for invaluable insight and new ideas. We thank Dr Nicolas Rochette for his insights into parameter analysis. Thanks also to Dr Andy King for assistance with the brown trout data molecular work and analysis, and Guy Freeman and Martin Young for the species illustrations. Prof Peter Kille and Dr Luis Cunha, Cardiff School of Biosciences, Cardiff University, kindly provided the reference genome of L. rubellus.
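
The parameter sweep described in the abstract above can be illustrated with a minimal Python sketch. It assumes that the 80% rule means choosing the parameter value that yields the most loci present in at least 80% of samples; the abstract does not spell out the definition, so treat this reading as an assumption. The function names and the toy counts are hypothetical and are not part of stacks.

```python
from typing import Dict, List


def count_r80_loci(samples_per_locus: List[int], n_samples: int,
                   threshold: float = 0.8) -> int:
    """Count loci that are present in at least `threshold` of all samples."""
    return sum(1 for n in samples_per_locus if n >= threshold * n_samples)


def pick_parameter(runs: Dict[int, List[int]], n_samples: int) -> int:
    """Return the parameter value whose run yields the most widely shared loci.

    Ties are broken in favour of the smallest parameter value.
    """
    return min(runs, key=lambda p: (-count_r80_loci(runs[p], n_samples), p))


# Hypothetical toy data: parameter value -> number of samples carrying each locus.
runs = {
    1: [10, 8, 3, 2],
    2: [10, 9, 8, 4],
    3: [10, 9, 8, 8, 1],
    4: [10, 9, 7, 8, 2],
}
best = pick_parameter(runs, n_samples=10)
print(best)
```

In practice the per-locus sample counts would come from the output of each assembly run, and the same comparison would be repeated for each parameter being tuned.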

Open Research Exeter

TagDigger: user-friendly extraction of read counts from GBS and RAD-seq data

Author: C Heffelfinger
DAR Eaton
Erik J. Sacks
F Lu
GP Morris
J Zohren
JA Poland
JC Glaubitz
JM Catchen
JW Davey
KG Dodds
Lindsay V. Clark
LV Clark
PA Hohenlohe
PD Blischak
R Nielsen
S Liu
SR Narum
SW Baxter
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Screening synteny blocks in pairwise genome comparisons through integer programming

Author: Andrew H Paterson
BJ Haas
Brent Pedersen
C Simillion
C Soderlund
E Lyons
Eric Lyons
G Tesler
H Tang
Haibao Tang
HW Six
James C Schnable
JE Bowers
JM Aury
JM Catchen
K Yogeeswaran
L Cui
M Kellis
Michael Freeling
O Jaillon
P Pevzner
Q Peng
R Warren
RM Karp
S Schwartz
SF Altschul
W Miller
WJ Kent
X Wang
Y Van de Peer
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: It is difficult to accurately interpret chromosomal correspondences such as true orthology and paralogy due to significant divergence of genomes from a common ancestor. Analyses are particularly problematic among lineages that have repeatedly experienced whole genome duplication (WGD) events. To compare multiple "subgenomes" derived from genome duplications, we need to relax the traditional requirements of "one-to-one" syntenic matchings of genomic regions in order to reflect "one-to-many" or more generally "many-to-many" matchings. However, this relaxation may result in the identification of synteny blocks that are derived from ancient shared WGDs that are not of interest. For many downstream analyses, we need to eliminate weak, low scoring alignments from pairwise genome comparisons. Our goal is to objectively select a subset of synteny blocks whose total scores are maximized while respecting the duplication history of the genomes in comparison. We call this "quota-based" screening of synteny blocks in order to appropriately fill a quota of syntenic relationships within one genome or between two genomes having WGD events. Results: We have formulated the synteny block screening as an optimization problem known as "Binary Integer Programming" (BIP), which is solved using existing linear programming solvers. The computer program QUOTA-ALIGN performs this task by creating a clear objective function that maximizes the compatible set of synteny blocks under given constraints on overlaps and depths (corresponding to the duplication history in respective genomes). Such a procedure is useful for any pairwise synteny alignments, but is most useful in lineages affected by multiple WGDs, like plants or fish lineages. For example, there should be a 1:2 ploidy relationship between genome A and B if genome B had an independent WGD subsequent to the divergence of the two genomes. We show through simulations and real examples using plant genomes in the rosid superorder that the quota-based screening can eliminate ambiguous synteny blocks and focus on specific genomic evolutionary events, like the divergence of lineages (in cross-species comparisons) and the most recent WGD (in self comparisons).
Conclusions: The QUOTA-ALIGN algorithm screens a set of synteny blocks to retain only those compatible with a user specified ploidy relationship between two genomes. These blocks, in turn, may be used for additional downstream analyses such as identifying true orthologous regions in interspecific comparisons. There are two major contributions of QUOTA-ALIGN: 1) reducing the block screening task to a BIP problem, which is novel; 2) providing an efficient software pipeline starting from all-against-all BLAST to the screened synteny blocks with dot plot visualizations. Python code and full documentation are publicly available at http://github.com/tanghaibao/quota-alignment. The QUOTA-ALIGN program is also integrated as a major component in SynMap (http://genomevolution.com/CoGe/SynMap.pl), offering easier access to thousands of genomes for non-programmers.
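
The quota-based screening described in the abstract above can be illustrated with a toy Python sketch. It is not the QUOTA-ALIGN implementation: the abstract says the BIP is handed to linear programming solvers, whereas this sketch simply enumerates every 0/1 assignment, which is only feasible for a handful of blocks. The block tuples, coordinates, scores and function name are hypothetical.

```python
from collections import Counter
from itertools import product


def screen_blocks(blocks, quota_a, quota_b):
    """Pick the highest-scoring subset of blocks that respects both quotas.

    Each block is (score, (a_start, a_end), (b_start, b_end)) with half-open
    integer ranges. No position on genome A may be covered by more than
    quota_a kept blocks, and no position on genome B by more than quota_b.
    """
    best_score, best_kept = 0, []
    # One binary decision variable per block: 1 = keep, 0 = discard.
    for x in product((0, 1), repeat=len(blocks)):
        depth_a, depth_b = Counter(), Counter()
        for keep, (_, a_range, b_range) in zip(x, blocks):
            if keep:
                depth_a.update(range(*a_range))
                depth_b.update(range(*b_range))
        if any(d > quota_a for d in depth_a.values()):
            continue  # quota on genome A violated
        if any(d > quota_b for d in depth_b.values()):
            continue  # quota on genome B violated
        score = sum(s for keep, (s, _, _) in zip(x, blocks) if keep)
        if score > best_score:
            best_score = score
            best_kept = [i for i, keep in enumerate(x) if keep]
    return best_score, best_kept


# Hypothetical blocks for a 1:2 ploidy relationship between genome A and B:
# each region of A may match up to two regions of B, each region of B one of A.
blocks = [
    (100, (0, 10), (0, 10)),
    (80, (0, 10), (20, 30)),
    (30, (0, 10), (40, 50)),   # weak block, e.g. from an older duplication
    (50, (5, 15), (0, 10)),    # competes with the first block on genome B
]
print(screen_blocks(blocks, quota_a=2, quota_b=1))
```

With these toy inputs the two strongest compatible blocks are retained and the weak third block is screened out, mirroring the 1:2 example in the abstract.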

Crossref

DigitalCommons@University of Nebraska

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The University of Arizona

eScholarship - University of California

Restriction associated DNA-genotyping at multiple spatial scales in Arabidopsis lyrata reveals signatures of pathogen-mediated selection

Author: A Amambua-Ngwa
A Fijarczyk
A Gouin
A Sicard
A Stamatakis
AF Bent
AL Pais
B Charlesworth
B Langmead
B Pfeifer
B Steuernagel
Barbara K. Mable
BK Mable
BS Weir
C Kamdem
D Koenig
D Tian
D Weigel
DB Lowry
EA Stahl
EB Holub
EG Bakker
Eric B. Holub
F Tajima
G Friis
G Gos
G Luikart
J Bergelson
J Buckley
J Catchen
J Ding
J Ellis
J Li
J Lighten
J Sperschneider
J Staal
James Buckley
JC Thomas
JE Parker
JM Catchen
JM Cork
JM McDowell
JP Foxe
JW Davey
K Keenan
L Noël
L Rose
LF Delph
LG Spurgin
M Bruneaux
M Croze
M Foll
M Grant
M Kreitman
M Nei
MA Beaumont
MA Koch
Marcus A. Koch
MH Borhan
MH Muller
MH Schierup
MJ Clauss
MK Sekhwal
N Hohmann
NA Baird
PA Hohenlohe
PC Sabeti
Philippine Vergeer
PW Hedrick
PY Novikova
R Mauricio
R Schmickl
RJ Haasl
RM Clark
S Asthana
S Dray
T Jombart
TL Karasov
TT Hu
X Gou
Y Willi
YL Guo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Background: Genome scans based on outlier analyses have revolutionized detection of genes involved in adaptive processes, but reports of some forms of selection, such as balancing selection, are still limited. It is unclear whether high throughput genotyping approaches for identification of single nucleotide polymorphisms have sufficient power to detect modes of selection expected to result in reduced genetic differentiation among populations. In this study, we used Arabidopsis lyrata to investigate whether signatures of balancing selection can be detected based on genomic smoothing of Restriction Associated DNA sequencing (RAD-seq) data. We compared how different sampling approaches (both within and between subspecies) and different background levels of polymorphism (inbreeding or outcrossing populations) affected the ability to detect genomic regions showing key signatures of balancing selection, specifically elevated polymorphism, reduced differentiation and shifts towards intermediate allele frequencies. We then tested whether candidate genes associated with disease resistance (R-gene analogs) were detected more frequently in these regions compared to other regions of the genome. Results: We found that genomic regions showing elevated polymorphism contained a significantly higher density of R-gene analogs predicted to be under pathogen-mediated selection than regions of non-elevated polymorphism, and that many of these also showed evidence for an intermediate site-frequency spectrum based on Tajima’s D. However, we found few genomic regions that showed both elevated polymorphism and reduced FST among populations, despite strong background levels of genetic differentiation among populations. This suggests either insufficient power to detect the reduced population structure predicted for genes under balancing selection using sparsely distributed RAD markers, or that other forms of diversifying selection are more common for the R-gene analogs tested. 
Conclusions: Genome scans based on a small number of individuals sampled from a wide range of populations were sufficient to confirm the relative scarcity of signatures of balancing selection across the genome, but also identified new potential disease resistance candidates within genomic regions showing signatures of balancing selection that would be strong candidates for further sequencing efforts

Repository for Publications and Research Data

Crossref

Heidelberger Dokumentenserver

Directory of Open Access Journals

Wageningen University & Research Publications

Plymouth Electronic Archive and Research Library

Edinburgh Research Explorer

Warwick Research Archives Portal Repository

Enlighten

QTL analysis and genomic selection using RADseq derived markers in Sitka spruce: the potential utility of within family data

Author: A. Law
B Nystedt
B Pelgas
C. Goswami
D Grattapaglia
DB Neale
F Isik
G Tuskan
GT Slavov
I Birol
J Beaulieu
J Zapata-Valenzuela
J. A. Woolliams
J. E. Cottrell
JC Venter
JM Catchen
JM Olson
JW Davey
K Ritland
KR Andrews
M Hannerz
M Lillehammer
MDV Resende
MFR Resende
N Amin
P Krutzsch
P Rice
P. Fuentes-Utrilla
PD Etter
PK Gupta
PM Raden Van
R. Pong-Wong
RD Houston
RK Hermann
S Thavamanikumar
S. J. Lee
S. W. A’Hara
SJ Lee
SW Baxter
THE Meuwissen
Y Peng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Sitka spruce (Picea sitchensis (Bong.) Carr) is the most common commercial plantation species in Britain and a breeding programme based on traditional lines has been in operation since the early 1960s. Rotation lengths of 40 years have led breeders to adopt a process of indirect selection at younger ages based on traits well correlated with final selection, but still the generation interval is unlikely to reduce much below twenty years. Recent successful developments with genomic selection in animal breeding have led tree breeders to consider the application of this technology. In this study a RAD sequence assay was developed as a means of investigating the potential of molecular breeding in a non-model species. DNA was extracted from nearly 500 clonally replicated trees growing in a single full-sibling family at one site in Britain. The technique proved successful in identifying 132 QTLs for 5-year bud-burst and 2 QTLs for 6-year height. In addition, the accuracy of predicting phenotypes by genomic selection was strikingly high at 0.62 and 0.59 respectively. Sensitivity analysis with 200 offspring found only a slight fall in correlation values (0.54 and 0.38), although when the training population was reduced to 50 offspring predictive values fell further (0.33 and 0.25). This proved an encouraging first investigation into the potential use of genomic selection in the breeding of Sitka spruce. The authors investigate how problems associated with effective population size and linkage disequilibrium can be avoided and suggest a practical way of incorporating genomic selection into a dynamic breeding programme.

Crossref

Springer - Publisher Connector

Edinburgh Research Explorer

Enlighten

Mapping the sex determination locus in the hāpuku (Polyprion oxygeneios) using ddRAD sequencing

Author: A Dettai
Alvin N. Setiawan
BK Peterson
C Palaiokostas
CB Wakefield
Christos Palaiokostas
D Garcia de la Serrana
David J. Penman
DD Kosambi
DJ Penman
GR Margarido
Jane E. Symonds
JE Symonds
Jeremy K. Brown
JM Catchen
John B. Taggart
JP Barreiros
JR Gonzalez
JR Khan
K Semagn
M Goddard
M Krzywinski
M Vandeputte
Michaël Bekaert
MP Francis
N Papandroulakis
P Rastas
R Betancur-R
RD Houston
RH Devlin
S Kwok
SA Anderson
SF Altschul
Stefanie Wehner
T Gamble
YY Kohn
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Background: Hāpuku (Polyprion oxygeneios) is a member of the wreckfish family (Polyprionidae) and is highly regarded as a food fish. Although adults grow relatively slowly, juveniles exhibit low feed conversion ratios and can reach market size in 1–2 years, making P. oxygeneios a strong candidate for aquaculture. However, they can take over 5 years to reach sexual maturity in captivity and are not externally sexually dimorphic, complicating many aspects of broodstock management. Understanding the sex determination system of P. oxygeneios and developing accurate assays to assign genetic sex will contribute significantly towards its full-scale commercialisation. Results: DNA from parents and sexed offspring (n = 57) from a single family of captive bred P. oxygeneios was used as a template for double digestion Restriction-site Associated DNA (ddRAD) sequencing. Two libraries were constructed using SbfI–SphI and SbfI–NcoI restriction enzyme combinations, respectively. Two runs on an Illumina MiSeq platform generated 70,266,464 raw reads, identifying 19,669 RAD loci. A combined sex linkage map (1367 cM) was constructed based on 1575 Single Nucleotide Polymorphism (SNP) markers that resolved into 35 linkage groups. Sex-specific linkage maps were of similar size (1132 and 1168 cM for male and female maps respectively). A single major sex-determining locus, found to be heterogametic in males, was mapped to linkage group 14. Several markers were found to be in strong linkage disequilibrium with the sex-determining locus. Allele-specific PCR assays were developed for two of these markers, SphI6331 and SphI8298, and demonstrated to accurately differentiate sex in progeny within the same pedigree. Comparative genomic analyses indicated that many of the linkage groups within the P. oxygeneios map share a relatively high degree of homology with those published for the European seabass (Dicentrarchus labrax).
Conclusion: P. oxygeneios has an XX/XY sex determination system. Evaluation of allele-specific PCR assays, based on the two SNP markers most closely associated with phenotypic sex, indicates that a simple molecular assay for sexing P. oxygeneios should be readily attainable. The high degree of synteny observed with D. labrax should aid further molecular genetic study and exploitation of hāpuku as a food fish.

Crossref

Stirling Online Research Repository (RIOXX)

Springer - Publisher Connector

PubMed Central

Edinburgh Research Explorer

Stirling Online Research Repository

Rapid niche expansion by selection on functional genomic variation after ecosystem recovery

Author: A Jacobs
A Roberts
A Siwertsson
ABA Shafer
AF Kautt
AM Bolger
AP Hendry
B Egger
B Langmead
B Lundsgaard-Hansen
B Zhang
BJG Sutherland
BS Weir
C Camacho
C Harrod
C Rougeux
CC Chang
CP Klingenberg
D Schluter
DH Alexander
E Ahi
E Frichot
E Yohannes
EB Taylor
G Gibson
G Thomas
GE Hoffman
H Mi
H Recknagel
HEL Lischer
J Behrmann-Godel
J Jeukens
J Moore
JA Chaves
JA Hamilton
JBS Haldane
JK Pickrell
JM Bullock
JM Catchen
JM Ranz
K Luu
K Østbye
KE Delmore
L Excoffier
M Carruthers
M Laporte
M Luczynski
M Quevedo
MC Jochimsen
ME Maan
MI Love
P Conejeros
P Danecek
P Langfelder
P Vonlanthen
PA Gagnaire
PD Gingerich
PE Hirsch
PG Meirmans
PJ Park
R Burri
RN Gutenkunst
RT Gilman
S Des Roches
S Lien
S Yeaman
SM Rudman
SP Pfeifer
T Jombart
TE Cruickshank
W Nümann
X Zhou
Y Benjamini
Y Hautier
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/10/2018

It is well recognized that environmental degradation caused by human activities can result in dramatic losses of species and diversity. However, comparatively little is known about the ability of biodiversity to re-emerge following ecosystem recovery. Here, we show that a European whitefish subspecies, the gangfisch Coregonus lavaretus macrophthalmus, rapidly increased its ecologically functional diversity following the restoration of Lake Constance after anthropogenic eutrophication. In fewer than ten generations, gangfisch evolved a greater range of gill raker numbers (GRNs) to utilize a broader ecological niche. A sparse genetic architecture underlies this variation in GRN. Several co-expressed gene modules and genes showing signals of positive selection were associated with GRN and body shape. These were enriched for biological pathways related to trophic niche expansion in fishes. Our findings demonstrate the potential of functional diversity to expand following habitat restoration, given a fortuitous combination of genetic architecture, genetic diversity and selection

Enlighten: Research Data (University of Glasgow)

Crossref

Enlighten

Selection of reliable biomarkers from PCR array analyses using relative distance computational model: Methodology and proof-of-concept study

Author: A Cuesta
A Hartwig
A Schnurstein
A Vested
A Yamaguchi
AM Soto
C Mattingly
Chunsheng Liu
E Vindimian
F Ariese
F Lucarelli
F Pomati
G Flouriot
GT Ankley
H Khalaf
H Xu
HK Hamadeh
HL Osborn
HM Stapleton
Hongyan Xu
JD Meeker
JM Catchen
JP Wu
JW Nelson
KG Harley
M Bartosiewicz
M Harada
M Murata
M Thomas
MC Caino
MG Simic
N Garcia-Reyero
R Franco
R Lackner
R van der Oost
RA Roberts
Raya Khanin
RP Amin
RT Di Giulio
S Lee
SC Gupta
SE Hook
SH Lam
Siew Hong Lam
W Zheng
Z Zeng
Zhiyuan Gong
ZT Handzel
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

DOI: 10.1371/journal.pone.0083954; PLoS ONE 8(12)

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

FigShare

Comparative Oncogenomic Analysis of Copy Number Alterations in Human and Zebrafish Tumors Enables Cancer Driver Discovery

Author: A Amores
A Amsterdam
A Faucherre
A Force
A Kozomara
A Walther
Adam Amsterdam
AI McClatchey
AW MacInnes
B Vogelstein
BA Weir
C Dai
C Gregorian
Charles A. Whittaker
CT Storlazzi
D Pinkel
D Tautz
DM Langenau
E Beert
EE Patton
EJ Kauffman
Eline Beert
Eric Legius
ES Venkatraman
ET Sawey
F Sanchez-Garcia
G Chai
G Streisinger
G Zhang
GP Nielsen
GuangJun Zhang
H Schmidt
HR Brekke
J Shin
J Tang
J Yu
J Zietsch
JA Jimenez-Heffernan
Jacqueline A. Lees
JH Postlethwait
JM Catchen
JM Woodruff
John H. Postlethwait
Julian M. Catchen
K Cichowski
K Lai
KE Torres
KF Macleod
KH Brown
L Chin
L Zender
LA Garraway
LA Rudner
LW Dillon
M Baudis
M Demestre
M Kasahara
M Kim
M Meyerson
M Sheffer
M Smid
MA Mohideen
MA Watson
Marshall S. Horwitz
MC Mione
MD Wallace
MP Ghadimi
MR Stratton
N Holtkamp
N McGranahan
Nancy Hopkins
PJ Stephens
Q Jiang
R Beroukhim
RB Phillips
RI Aqeilan
RS Maser
S Berghmans
S Frohling
S Kavumpurath
S Roessler
S Zhu
Sarah Farrington
Sebastian Hoersch
SF Bakhoum
SH Kresse
SJ Kazmi
SL Carter
SL Johnson
TJ Hulsebos
TM Kim
W Xue
WM Lin
Y Nakagawa
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

The identification of cancer drivers is a major goal of current cancer research. Finding driver genes within large chromosomal events is especially challenging because such alterations encompass many genes. Previously, we demonstrated that zebrafish malignant peripheral nerve sheath tumors (MPNSTs) are highly aneuploid, much like human tumors. In this study, we examined 147 zebrafish MPNSTs by massively parallel sequencing and identified both large and focal copy number alterations (CNAs). Given the low degree of conserved synteny between fish and mammals, we reasoned that comparative analyses of CNAs from fish versus human MPNSTs would enable elimination of a large proportion of passenger mutations, especially on large CNAs. We established a list of orthologous genes between human and zebrafish, which includes approximately two-thirds of human protein-coding genes. For the subset of these genes found in human MPNST CNAs, only one quarter of their orthologues were co-gained or co-lost in zebrafish, dramatically narrowing the list of candidate cancer drivers for both focal and large CNAs. We conclude that zebrafish-human comparative analysis represents a powerful, and broadly applicable, tool to enrich for evolutionarily conserved cancer drivers.

Kathy and Curt Marble Cancer Research Fund; Arthur C. Merrill; National Institutes of Health (U.S.) (Grant CA106416); National Institutes of Health (U.S.) (Grant ROI RR020833); National Institutes of Health (U.S.) (Grant 1F32GM095213-01

Directory of Open Access Journals

PubMed Central

MDC Repository

FigShare

Double Digest RADseq: An Inexpensive Method for De Novo SNP Discovery and Genotyping in Model and Non-Model Species

Author: Brant K. Peterson
CM Ramsdell
CP van Tassell
D Altshuler
DA Pollard
DW Craig
EM Kenny
Emily H. Kay
G Lunter
GP Consortium 1000
H Li
Heidi S. Fisher
Hopi E. Hoekstra
J Felsenstein
JC Avise
Jesse N. Weber
JL Davey
JM Catchen
KJ Emerson
KW Broman
L Li
L Salmela
LM Turner
Ludovic Orlando
MA Depristo
MA Quail
MA White
MD Carling
N Patterson
NA Baird
NJ van Orsouw
P Andolfatto
PA Hohenlohe
RC Edgar
S Alon
TFC Mackay
WF Dietrich
WF Pfender
WJ Kent
Z Gompert
Publication venue: Public Library of Science
Publication date: 01/01/2012

The ability to efficiently and accurately determine genotypes is a keystone technology in modern genetics, crucial to studies ranging from clinical diagnostics, to genotype-phenotype association, to reconstruction of ancestry and the detection of selection. To date, high capacity, low cost genotyping has been largely achieved via “SNP chip” microarray-based platforms which require substantial prior knowledge of both genome sequence and variability, and once designed are suitable only for those targeted variable nucleotide sites. This method introduces substantial ascertainment bias and inherently precludes detection of rare or population-specific variants, a major source of information for both population history and genotype-phenotype association. Recent developments in reduced-representation genome sequencing experiments on massively parallel sequencers (commonly referred to as RAD-tag or RADseq) have brought direct sequencing to the problem of population genotyping, but increased cost and procedural and analytical complexity have limited their widespread adoption. Here, we describe a complete laboratory protocol, including a custom combinatorial indexing method, and accompanying software tools to facilitate genotyping across large numbers (hundreds or more) of individuals for a range of markers (hundreds to hundreds of thousands). Our method requires no prior genomic knowledge and achieves per-site and per-individual costs below that of current SNP chip technology, while requiring similar hands-on time investment, comparable amounts of input DNA, and downstream analysis times on the order of hours. Finally, we provide empirical results from the application of this method to both genotyping in a laboratory cross and in wild populations. Because of its flexibility, this modified RADseq approach promises to be applicable to a diversity of biological questions in a wide range of organisms

CiteSeerX

Public Library of Science (PLOS)

Crossref

Harvard University - DASH

PubMed Central